TIME: Text and Image Mutual-Translation Adversarial Networks

Authors

Abstract

Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework. While previous methods tackle the T2I problem as a uni-directional task and use pre-trained language models to enforce image-text consistency, TIME requires neither extra modules nor pre-training. We show that the performance of G can be boosted substantially by training it jointly with D as a language model. Specifically, we adopt Transformers to model the cross-modal connections between the image features and word embeddings, and design an annealing conditional hinge loss that dynamically balances the adversarial learning. In our experiments, TIME achieves state-of-the-art (SOTA) performance on the CUB dataset (Inception Score of 4.91 and Fréchet Inception Distance of 14.3 on CUB), and shows promising performance on the MS-COCO dataset on image captioning and downstream vision-language tasks.
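The abstract names an annealing conditional hinge loss but does not give its form. The following is only a minimal sketch of the general idea, under stated assumptions: a standard GAN hinge loss for the discriminator, plus a conditional (image-text) hinge term whose weight grows over training. The linear schedule, the function names, and the split into unconditional and conditional scores are all hypothetical and are not taken from the paper.

```python
def hinge_d_loss(real_scores, fake_scores):
    """Standard GAN hinge loss for a discriminator, averaged over samples.

    Real samples are pushed towards scores >= 1, fake samples towards <= -1.
    """
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def anneal_weight(step, total_steps):
    """Hypothetical linear schedule from 0 to 1 (an assumption, not the paper's)."""
    return min(1.0, max(0.0, step / total_steps))


def annealed_conditional_d_loss(uncond_real, uncond_fake,
                                cond_real, cond_fake,
                                step, total_steps):
    """Unconditional hinge loss plus an annealed conditional hinge term.

    `cond_*` are assumed to be discriminator scores for (image, text) pairs;
    their weight is ramped up as training progresses.
    """
    w = anneal_weight(step, total_steps)
    return hinge_d_loss(uncond_real, uncond_fake) + w * hinge_d_loss(cond_real, cond_fake)


if __name__ == "__main__":
    # Early in training the conditional term contributes nothing;
    # by the end it is weighted fully.
    early = annealed_conditional_d_loss([0.5], [-0.5], [0.0], [0.0], step=0, total_steps=100)
    late = annealed_conditional_d_loss([0.5], [-0.5], [0.0], [0.0], step=100, total_steps=100)
    print(early, late)
```

Under this assumed schedule, the discriminator first learns to separate real from generated images and only gradually is asked to judge image-text consistency, which is one plausible way a loss could "dynamically balance" the two objectives.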

Similar resources

Improvement of generative adversarial networks for automatic text-to-image generation

This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous studies have used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, each given as a sentence, to produce and improve the image. The proposed ...

Unsupervised Image-to-Image Translation with Generative Adversarial Networks

It’s useful to automatically transform an image from its original form to some synthetic form (style, partial contents, etc.), while keeping the original structure or semantics. We define this requirement as the “image-to-image translation” problem, and propose a general approach to achieve it, based on deep convolutional and conditional generative adversarial networks (GANs), which has gained ...

Generative Adversarial Text to Image Synthesis

Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly com...

Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation

Image-to-image translation has made much progress by embracing Generative Adversarial Networks (GANs). However, it’s still very challenging for translation tasks that require high quality, especially at high resolution and photo-realism. In this paper, we present Discriminative Region Proposal Adversarial Networks (DRPANs) with three components: a generator, a discriminator and a reviser,...

In2I: Unsupervised Multi-Image-to-Image Translation Using Generative Adversarial Networks

In unsupervised image-to-image translation, the goal is to learn the mapping between an input image and an output image using a set of unpaired training images. In this paper, we propose an extension of the unsupervised image-to-image translation problem to the multiple-input setting. Given a set of paired images from multiple modalities, a transformation is learned to translate the input into a spe...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i3.16305